Toward Rigorous Data Harmonization in Cancer Epidemiology Research: One Approach.
نویسندگان
چکیده
Cancer epidemiologists have a long history of combining data sets in pooled analyses, often harmonizing heterogeneous data from multiple studies into 1 large data set. Although there are useful websites on data harmonization with recommendations and support, there is little research on best practices in data harmonization; each project conducts harmonization according to its own internal standards. The field would be greatly served by charting the process of data harmonization to enhance the quality of the harmonized data. Here, we describe the data harmonization process utilized at the Fred Hutchinson Cancer Research Center (Seattle, Washington) by the coordinating centers of several research projects. We describe a 6-step harmonization process, including: 1) identification of questions the harmonized data set is required to answer; 2) identification of high-level data concepts to answer those questions; 3) assessment of data availability for data concepts; 4) development of common data elements for each data concept; 5) mapping and transformation of individual data points to common data elements; and 6) quality-control procedures. Our aim here is not to claim a "correct" way of doing data harmonization but to encourage others to describe their processes in order that we can begin to create rigorous approaches. We also propose a research agenda around this issue.
منابع مشابه
Maelstrom Research guidelines for rigorous retrospective data harmonization
Background It is widely accepted and acknowledged that data harmonization is crucial: in its absence, the co-analysis of major tranches of high quality extant data is liable to inefficiency or error. However, despite its widespread practice, no formalized/systematic guidelines exist to ensure high quality retrospective data harmonization. Methods To better understand real-world harmonization ...
متن کاملIs rigorous retrospective harmonization possible? Application of the DataSHaPER approach across 53 large studies.
BACKGROUND Proper understanding of the roles of, and interactions between genetic, lifestyle, environmental and psycho-social factors in determining the risk of development and/or progression of chronic diseases requires access to very large high-quality databases. Because of the financial, technical and time burdens related to developing and maintaining very large studies, the scientific commu...
متن کاملToward an integrated approach to molecular epidemiology.
The emergence of "molecular epidemiology" as a scientific approach within the fields of epidemiology and toxicology has led to spirited discussion within the biomedical community, particularly in the area of cancer research. At scientific meetings and in peer-reviewed journals, numerous issues have been raised not only with regard to the practice of molecular epidemiology, but also with regard ...
متن کاملA Generic Data Harmonization Process for Cross-linked Research and Network Interaction. Construction and Application for the Lung Cancer Phenotype Database of the German Center for Lung Research.
OBJECTIVE Joint data analysis is a key requirement in medical research networks. Data are available in heterogeneous formats at each network partner and their harmonization is often rather complex. The objective of our paper is to provide a generic approach for the harmonization process in research networks. We applied the process when harmonizing data from three sites for the Lung Cancer Pheno...
متن کاملA Generic Data Harmonization Process for Cross-linked Research and Network Interaction
Objective: Joint data analysis is a key requirement in medical research networks. Data are available in heterogeneous formats at each network partner and their harmoni zation is often rather complex. The objective of our paper is to provide a generic approach for the harmonization process in research networks. We applied the process when harmonizing data from three sites for the Lung Cancer Phe...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- American journal of epidemiology
دوره 182 12 شماره
صفحات -
تاریخ انتشار 2015